
10 research outputs found

Is Distributed Database Evaluation Cloud-Ready?

Author: D Agrawal
D Bermbach
DE Difallah
H Khazaei
J Gray
K Grolinger
PJ Sadalage
S Gilbert
Publication date: 03/10/2017

The database landscape has evolved significantly over the last decade, as cloud computing makes it possible to run distributed databases on virtually unlimited cloud resources. Hence, the already non-trivial task of selecting and deploying a distributed database system becomes more challenging. Database evaluation frameworks aim to ease this task by guiding the database selection and deployment decision. The evaluation of databases has evolved as well, shifting its focus from performance to distribution aspects such as scalability and elasticity. This paper presents a cloud-centric analysis of distributed database evaluation frameworks based on evaluation tiers and framework requirements. It analyses eight well-adopted evaluation frameworks. The results show that the evaluation tiers performance, scalability, elasticity and consistency are well supported, in contrast to resource selection and availability. Further, the analysed frameworks support classic evaluation requirements but not cloud-centric ones.

Crossref

ZENODO

Learning How to Optimize Data Access in Polystores

Author: IP Fellegi
J Duggan
JR Quinlan
P Atzeni
PJ Sadalage
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2019

Polystores provide a loosely coupled integration of heterogeneous data sources based on direct access to each storage engine in its local language, so that the distinctive features of each engine can be exploited. In this framework, given the absence of a global schema, a common set of operators, and a unified data profile repository, it is hard to design efficient query optimizers. Recently, we have proposed QUEPA, a polystore system supporting query augmentation, a data access operator based on the automatic enrichment of the answer to a local query with related data in the rest of the polystore. This operator provides a lightweight mechanism for data integration and allows the use of the original query languages, avoiding any query translation. However, since in a polystore we usually do not have access to the parameters used by the query optimizers of the underlying datastores, defining an optimal query execution plan is a hard task, as traditional cost-based methods for query optimization cannot be used. For this reason, in building QUEPA, we have adopted a machine learning technique to optimize the way in which query augmentation is implemented at run-time. In this paper, after recalling the main features of QUEPA and of its architecture, we describe our approach to query optimization and highlight its effectiveness.

Crossref

Archivio della Ricerca - Università di Roma 3

Answering GPSJ Queries in a Polystore: a Dataspace-Based Approach

Author: A Corbellini
E Rahm
F Chang
L Wang
M Golfarelli
M Zaharia
MJ Franklin
PJ Sadalage
SJ Thomas
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/10/2019

The discipline of data science is steering analysts away from traditional data warehousing and towards a more flexible and lightweight approach to data analysis. The idea is to perform OLAP analyses in a pay-as-you-go manner across heterogeneous schemas and data models, where the integration is progressively carried out by the user as the available data is explored. In this paper, we propose an approach to support data analysis within a polystore supporting relational, document and column data models by automatically handling both data model and schema heterogeneity through a dataspace layer on top of the underlying databases. The expressiveness we enable corresponds to GPSJ queries, which are the most common class of queries in OLAP applications. We rely on Nested Relational Algebra to define a cross-database execution plan. The plan is composed of several local plans, to be executed on the distinct databases, and a global plan, which combines and possibly aggregates inter-database data. The system has been prototyped on Apache Spark.

Crossref

Scientific Publications of the University of Toulouse II Le Mirail

Open Archive Toulouse Archive Ouverte

HIFUN - a high level functional query language for big data analytics

Author: AY Halevy
BC Pierce
D Abadi
D Maier
Dominik Ślęzak
M Stonebraker
Nicolas Spyratos
PJ Sadalage
R Cattell
Tsuyoshi Sugibuchi
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Towards quality analysis for document oriented bases

Author: A Nayak
D Sevilla Ruiz
E Gallinucci
F Abdelhedi
L Wang
M Klettke
M Pušnik
MJ Mior
N Fenton
O Herden
PJ Sadalage
R Copeland
S Abiteboul
TJ McCabe
W Li
Publication venue: HAL CCSD
Publication date: 22/10/2018

Document-oriented bases allow high flexibility in data representation, which facilitates rapid development of applications and enables many possibilities for data structuring. Nevertheless, the structural choices remain crucial because of their impact on several aspects of the document base and application quality, e.g., memory footprint, data redundancy, readability and maintainability. Our research is motivated by quality issues of document-oriented bases. We aim at facilitating the study of the possibilities of data structuring and providing objective metrics to better reveal the advantages and disadvantages of each solution with respect to user needs. In this paper, we propose a set of structural metrics for a JSON-compatible schema abstraction. These metrics reflect the complexity of the structure and are intended to be used as decision criteria in the schema analysis and design process. This work capitalizes on experiences with MongoDB, XML and software complexity metrics. The paper presents the definition of the metrics together with a validation scenario where we discuss how to use the results in a schema recommendation perspective.

Crossref

Hal - Université Grenoble Alpes